Augmented Skeleton Based Contrastive Action Learning with Momentum LSTM for Unsupervised Action Recognition
Abstract
Action recognition via 3D skeleton data is an emerging and important topic. Most existing methods rely on hand-crafted descriptors to recognize actions, or perform supervised action representation learning with massive labels. In this paper, we propose for the first time a contrastive paradigm named AS-CAL that exploits different augmentations of unlabeled sequences to learn representations in an unsupervised manner. Specifically, we first contrast the similarity between augmented instances of the input sequence, which are transformed by multiple novel augmentation strategies, to learn inherent patterns ("pattern-invariance") across transformations. Second, to encourage learning pattern-invariance with more consistent representations, we propose a momentum LSTM, implemented as the momentum-based moving average of the LSTM-based query encoder, to encode the long-term dynamics of the key sequence. Third, we introduce a queue to store the encoded keys, which allows flexibly reusing preceding keys to build a dictionary and facilitate learning. Last, we propose Contrastive Action Encoding (CAE) to represent a human's action effectively. Empirical evaluations show that our approach significantly outperforms hand-crafted methods by 10–50% Top-1 accuracy, and it can even achieve superior performance to many supervised methods. (Our code is available at https://github.com/Mikexu007/AS-CAL.)
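The three mechanisms named in the abstract (a momentum-averaged key encoder, a queue of encoded keys, and a contrast between query and key) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the parameter-dictionary layout, the momentum value `m`, the `temperature`, and the use of an InfoNCE-style loss are all assumptions made for illustration, and the encoders themselves (LSTMs in the paper) are left out.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def momentum_update(key_params, query_params, m=0.999):
    """Moving average of parameters: key <- m * key + (1 - m) * query.

    Both arguments are hypothetical dicts mapping parameter names to arrays.
    """
    return {name: m * key_params[name] + (1.0 - m) * query_params[name]
            for name in key_params}


def enqueue_dequeue(queue, new_keys):
    """FIFO queue: drop the oldest keys and append the newest ones.

    Assumes len(new_keys) <= len(queue), so the queue size stays constant.
    """
    return np.concatenate([queue[len(new_keys):], new_keys], axis=0)


def info_nce_loss(q, k, queue, temperature=0.07):
    """InfoNCE-style loss (an assumed choice, not confirmed by the abstract).

    q:     (B, D) encoded queries
    k:     (B, D) encoded keys from the same sequences (positives)
    queue: (K, D) previously encoded keys (negatives)
    """
    q, k, queue = l2_normalize(q), l2_normalize(k), l2_normalize(queue)
    l_pos = np.sum(q * k, axis=1, keepdims=True)   # (B, 1) positive similarity
    l_neg = q @ queue.T                            # (B, K) negative similarities
    logits = np.concatenate([l_pos, l_neg], axis=1) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_prob[:, 0].mean())           # positive sits at index 0


# Illustrative usage with random stand-ins for encoder outputs.
rng = np.random.default_rng(0)
queries = rng.normal(size=(4, 8))
keys = queries + 0.01 * rng.normal(size=(4, 8))    # two views of the same input
queue = rng.normal(size=(16, 8))
loss = info_nce_loss(queries, keys, queue)
queue = enqueue_dequeue(queue, l2_normalize(keys))
```

In a training loop of this kind, only the query encoder would receive gradients; the key encoder would be refreshed by `momentum_update` after each step, which keeps the keys stored in the queue consistent with one another.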
Similar resources
Co-Occurrence Feature Learning for Skeleton Based Action Recognition Using Regularized Deep LSTM Networks
Skeleton based action recognition distinguishes human actions using the trajectories of skeleton joints, which provide a very good representation for describing actions. Considering that recurrent neural networks (RNNs) with Long Short-Term Memory (LSTM) can learn feature representations and model long-term temporal dependencies automatically, we propose an end-to-end fully connected deep LSTM n...
Skeleton-based action recognition with extreme learning machines
Action and gesture recognition from motion capture and RGB-D camera sequences has recently emerged as a renowned and challenging research topic. The current methods can usually be applied only to small datasets with a dozen or so different actions, and the systems often require large amounts of time to train the models and to classify new sequences. In this paper, we first extract si...
Skeleton-Based Action Recognition Using Spatio-Temporal LSTM Network with Trust Gates
Skeleton-based human action recognition has attracted a lot of research attention during the past few years. Recent works attempted to utilize recurrent neural networks to model the temporal dependencies between the 3D positional configurations of human body joints for better analysis of human activities in the skeletal data. The proposed work extends this idea to spatial domain as well as temp...
Fusing Geometric Features for Skeleton-Based Action Recognition using Multilayer LSTM Networks
Recent skeleton-based action recognition approaches achieve great improvement by using RNN models. Currently these approaches build an end-to-end network from coordinates of joints to class categories and improve accuracy by extending RNN to spatial domains. First, while such well-designed models and optimization strategies explore relations between different parts directly from joint coordinat...
A Fusion of Appearance based CNNs and Temporal evolution of Skeleton with LSTM for Daily Living Action Recognition
In this paper, we propose an efficient method which combines skeleton information and appearance features for daily-living action recognition. Many RGB methods focus only on short-term temporal information obtained from optical flow. Skeleton-based methods, on the other hand, show that modeling long-term skeleton evolution improves action recognition accuracy. In this paper we propose to fuse skelet...
Journal
Journal title: Information Sciences
Year: 2021
ISSN: 0020-0255, 1872-6291
DOI: https://doi.org/10.1016/j.ins.2021.04.023